Predicting the subcellular localization of mycobacterial proteins by incorporating the optimal tripeptides into the general form of pseudo amino acid composition

Mol Biosyst. 2015 Feb;11(2):558-63. doi: 10.1039/c4mb00645c. Epub 2014 Dec 1.

Abstract

Mycobacterium tuberculosis is the bacterium that causes tuberculosis, one of the most prevalent infectious diseases. Predicting the subcellular localization of its proteins may provide vital clues for predicting protein function as well as for drug discovery and design. A computational method that can predict the subcellular localization of mycobacterial proteins with high precision is therefore highly desirable, and we propose such a method here. An objective and strict benchmark dataset was constructed by collecting 272 non-redundant proteins from the Universal Protein Resource (the UniProt database). Subsequently, a novel feature selection strategy based on the binomial distribution was used to optimize the feature vector. Finally, a subset of 219 selected tripeptide features was fed into a support vector machine-based predictor, and its ability to identify these proteins accurately and sensitively was evaluated. The proposed method gave a maximum overall accuracy of 89.71% with an average accuracy of 81.12% in jackknife cross-validation. These results indicate that our prediction method compares favorably with other published methods. We have made the proposed method available as a purpose-built, freely accessible Web server called MycoSub. We anticipate that MycoSub will become a useful tool for studying the functions of mycobacterial proteins and for designing and developing anti-mycobacterial drugs.
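
As a rough illustration of the feature steps described in the abstract, the following Python sketch computes the 8000-dimensional tripeptide composition of a sequence and ranks tripeptides by a binomial-distribution confidence score. The abstract does not state the exact formula, so the score used here (one minus the upper-tail binomial probability of a tripeptide's count in a class, maximized over classes) is an assumption. All function names and the toy sequences are hypothetical, and this is not the MycoSub implementation.

```python
from collections import Counter
from itertools import product
from math import comb

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 20^3 = 8000 possible tripeptides, in a fixed order.
TRIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=3)]


def tripeptide_counts(seq):
    """Count overlapping tripeptides, skipping any with non-standard residues."""
    raw = Counter(seq[i:i + 3] for i in range(len(seq) - 2))
    return Counter({t: c for t, c in raw.items() if set(t) <= set(AMINO_ACIDS)})


def tripeptide_composition(seq):
    """Return the 8000-dimensional vector of tripeptide frequencies."""
    counts = tripeptide_counts(seq)
    total = sum(counts.values())
    return [counts[t] / total if total else 0.0 for t in TRIPEPTIDES]


def binomial_confidence(n_ij, n_i, q_j):
    """Assumed score: 1 - P(X >= n_ij) for X ~ Binomial(n_i, q_j).

    n_ij: occurrences of tripeptide i in class j
    n_i:  occurrences of tripeptide i in all classes
    q_j:  share of all tripeptide occurrences that belong to class j
    Exact summation is fine for small counts; for large counts a library
    routine such as scipy.stats.binom.sf would be more robust.
    """
    tail = sum(comb(n_i, m) * q_j ** m * (1 - q_j) ** (n_i - m)
               for m in range(n_ij, n_i + 1))
    return 1.0 - tail


def rank_tripeptides(seqs_by_class):
    """Rank observed tripeptides by their best confidence over all classes."""
    class_counts = {label: sum((tripeptide_counts(s) for s in seqs), Counter())
                    for label, seqs in seqs_by_class.items()}
    class_totals = {label: sum(c.values()) for label, c in class_counts.items()}
    grand_total = sum(class_totals.values())
    overall = sum(class_counts.values(), Counter())

    scores = {}
    for tri, n_i in overall.items():
        scores[tri] = max(
            binomial_confidence(class_counts[label][tri], n_i,
                                class_totals[label] / grand_total)
            for label in class_counts
        )
    # Highest confidence first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy data (hypothetical sequences and class labels, not from the paper).
toy = {"class_A": ["AAAAAA"], "class_B": ["CCCCCC", "ACDACD"]}
ranked = rank_tripeptides(toy)
print(ranked[:3])
```

In a workflow like the one the abstract describes, the top-ranked tripeptides (219 in the paper) would form the feature vector passed to a support vector machine and evaluated by jackknife (leave-one-out) cross-validation.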

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acids / metabolism*
  • Bacterial Proteins / metabolism*
  • Databases, Protein
  • Mycobacterium tuberculosis / metabolism*
  • Peptides / metabolism*
  • Protein Transport
  • Subcellular Fractions / metabolism

Substances

  • Amino Acids
  • Bacterial Proteins
  • Peptides